
🤖 fix: prevent MCP schema panic in control plane status tool#58

Merged
ThomasK33 merged 1 commit into main from fix-mcp-panic
Feb 12, 2026

Conversation

@ThomasK33
Member

Summary

Fix an MCP startup panic by replacing metav1.Condition in the get_control_plane_status tool output with a JSON-schema-safe summary type. Also add a regression test to ensure NewServer does not panic during tool registration, and align local lint tooling with CI by invoking the vendored golangci-lint v2 via go tool.

Background

coder-k8s could panic on startup when MCP registered get_control_plane_status, because the tool's output schema exposed metav1.Condition, which contains metav1.Time fields that embed time.Time. The schema inference path rejects that embedded custom schema shape, crashing MCP registration and preventing the app from starting.

Implementation

  • Added controlPlaneConditionSummary and changed getControlPlaneStatusOutput.Conditions to []controlPlaneConditionSummary.
  • Added summarizeMetav1Conditions(...) helper to convert condition fields and stringify LastTransitionTime as RFC3339Nano.
  • Updated get_control_plane_status handler to return summarized conditions.
  • Added internal/app/mcpapp/server_test.go with TestNewServerDoesNotPanic using fake clients.
  • Updated Makefile lint target to use vendored golangci-lint v2:
    • go tool golangci-lint run ./...
    • go tool golangci-lint fmt --diff

Validation

  • make verify-vendor
  • make test
  • make build
  • make lint

Risks

Low to medium. The API shape for a single MCP tool response changed from raw Kubernetes condition objects to summarized condition objects, but this is intentional to prevent startup panics and is scoped to MCP output formatting.


📋 Implementation Plan

Fix MCP tool schema panic (get_control_plane_status)

Context / Why

Running coder-k8s in --app=all (the default in deploy/deployment.yaml) currently crash-loops with a panic during MCP tool registration:

```
panic: AddTool: tool "get_control_plane_status": output schema: ForType(mcpapp.getControlPlaneStatusOutput): computing element schema: custom schema for embedded struct must have type "object", got "string"
```

This prevents the operator stack from starting at all (controller + aggregated API server + MCP HTTP run in the same pod).

Evidence

  • internal/app/mcpapp/tools.go:getControlPlaneStatusOutput currently returns []metav1.Condition in the tool output.

  • metav1.Condition contains metav1.Time fields, and metav1.Time embeds time.Time.

  • The vendored schema generator used by MCP (vendor/github.com/google/jsonschema-go/jsonschema) has a hard-coded custom schema for time.Time with Type: "string" and rejects embedded structs whose custom schema isn’t Type: "object".

    • The failure occurs in vendor/github.com/google/jsonschema-go/jsonschema/infer.go around the embedded-field check:

```go
if field.Anonymous {
	override := schemas[field.Type]
	if override != nil {
		if override.Type != "object" {
			return nil, fmt.Errorf(
				`custom schema for embedded struct must have type "object", got %q`,
				override.Type,
			)
		}
	}
}
```
  • Scan of internal/app/mcpapp/**/*.go confirms getControlPlaneStatusOutput is the only MCP tool output type that exposes metav1.Condition directly; all other tools already use “summary” structs with stringified timestamps.

Implementation plan

1) Replace metav1.Condition in MCP output with a summary type

File: internal/app/mcpapp/tools.go

  1. Introduce a safe condition summary struct that does not include any Kubernetes time types:
```go
// NOTE: We cannot return metav1.Condition directly because metav1.Time embeds time.Time,
// which causes jsonschema-go schema inference to fail for MCP tool schemas.
type controlPlaneConditionSummary struct {
	Type               string `json:"type"`
	Status             string `json:"status"`
	Reason             string `json:"reason,omitempty"`
	Message            string `json:"message,omitempty"`
	ObservedGeneration int64  `json:"observedGeneration,omitempty"`
	LastTransitionTime string `json:"lastTransitionTime,omitempty"`
}
```
  2. Update getControlPlaneStatusOutput to use this summary type:
```go
type getControlPlaneStatusOutput struct {
	ObservedGeneration int64                          `json:"observedGeneration"`
	ReadyReplicas      int32                          `json:"readyReplicas"`
	URL                string                         `json:"url"`
	Phase              string                         `json:"phase"`
	Conditions         []controlPlaneConditionSummary `json:"conditions"`
}
```

2) Convert conditions when handling the tool call

File: internal/app/mcpapp/tools.go

Add a small helper to keep tool handlers clean and ensure consistent formatting:

```go
func summarizeMetav1Conditions(in []metav1.Condition) []controlPlaneConditionSummary {
	out := make([]controlPlaneConditionSummary, 0, len(in))
	for _, cond := range in {
		summary := controlPlaneConditionSummary{
			Type:               cond.Type,
			Status:             string(cond.Status),
			Reason:             cond.Reason,
			Message:            cond.Message,
			ObservedGeneration: cond.ObservedGeneration,
		}
		if !cond.LastTransitionTime.IsZero() {
			summary.LastTransitionTime = cond.LastTransitionTime.Time.UTC().Format(time.RFC3339Nano)
		}
		out = append(out, summary)
	}
	return out
}
```

Then update the get_control_plane_status tool handler to return:

```go
Conditions: summarizeMetav1Conditions(controlPlane.Status.Conditions),
```

3) Add a regression test to prevent future panics

File: internal/app/mcpapp/server_test.go (new)

Goal: ensure that tool registration (schema inference) does not panic.

Test shape:

  • Build a fake controller-runtime client with newScheme().
  • Use k8s.io/client-go/kubernetes/fake.NewSimpleClientset().
  • Call NewServer(k8sClient, clientset) inside a defer recover() block and fail the test if it panics.

This catches:

  • reintroducing metav1.Condition / metav1.Time / other embedded-time types into MCP tool IO structs
  • any future schema inference regressions
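The core of that regression test is a recover-based no-panic assertion. Here is a dependency-free sketch of the pattern; requireNoPanic and the inline constructors are hypothetical stand-ins for the real test calling mcpapp.NewServer with fake clients.

```go
package main

import "fmt"

// requireNoPanic runs a constructor inside a recover block and converts
// any panic into an error, which a test would report via t.Fatalf.
func requireNoPanic(construct func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("constructor panicked: %v", r)
		}
	}()
	construct()
	return nil
}

func main() {
	// A constructor that panics during schema inference (as NewServer did
	// before the fix) is caught and surfaced as an error:
	fmt.Println(requireNoPanic(func() { panic("output schema: ...") }) != nil) // true
	// A well-behaved constructor yields no error:
	fmt.Println(requireNoPanic(func() {}) == nil) // true
}
```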

4) Validation / smoke checks

Run locally:

  • make test
  • make lint (or at minimum GOFLAGS=-mod=vendor golangci-lint run if that’s the repo norm)

Optional runtime smoke:

  • GOFLAGS=-mod=vendor go run . --app=mcp-http (should start without panic)
  • Rebuild and load the image into kind and confirm the operator pod becomes Ready when running --app=all.

Acceptance criteria

  • coder-k8s no longer panics on startup when running --app=all.
  • MCP HTTP mode starts successfully, and the get_control_plane_status tool returns conditions (in summarized form).
  • go test ./... (via make test) passes.


Replace get_control_plane_status conditions with a summary schema that avoids metav1.Time embedding issues, add a non-panic NewServer regression test, and switch make lint to go tool golangci-lint so local lint uses the vendored v2 toolchain.

---

_Generated with `mux` • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$1.13`_

@ThomasK33
Member Author

@codex review

Please review this change.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Already looking forward to the next diff.


@ThomasK33 ThomasK33 added this pull request to the merge queue Feb 12, 2026
@ThomasK33
Member Author

Merged via the queue into main with commit be6654a Feb 12, 2026
8 checks passed
@ThomasK33 ThomasK33 deleted the fix-mcp-panic branch February 12, 2026 10:01
